Automated ICD Coding Based on Word Embedding with Entry Embedding and Attention Mechanism
ZHANG Hongke, FU Zhenxin, REN Qianping, XU Hui, ZHAO Dongyan, YAN Rui
Acta Scientiarum Naturalium Universitatis Pekinensis    2020, 56 (1): 1-8.   DOI: 10.13209/j.0479-8023.2019.095
The authors propose a neural model based on word embedding with entry embedding and attention mechanisms, which makes full use of the unstructured text in electronic medical records to automate ICD coding of the main diagnosis on the front page of the medical record. The method first embeds the words that contain medical record entries into word embeddings and enriches the word-level representation with keyword attention. Word attention is then used to highlight the role of key words and enhance the text representation. Finally, a fully connected neural network classifier outputs the ICD codes. An ablation study on a Chinese electronic medical record data set shows that word embedding with entry embedding, keyword attention, and word attention are effective. Compared with the baselines, the proposed model achieves the best results on classification of 81 diseases and can effectively improve the quality of automated ICD coding.
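The word-attention step in this abstract can be illustrated with a minimal sketch. It is not the authors' model: the function names, the dot-product scoring, and the fixed query vector are assumptions made for illustration, and in a trained network the query and embeddings would be learned parameters.

```python
import math

def softmax(scores):
    """Turn raw scores into weights that sum to 1."""
    peak = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attention_pool(word_vectors, query):
    """Weight each word vector by its dot product with a query vector.

    Both arguments are hypothetical stand-ins: word_vectors for the
    word embeddings of one record, query for a learned attention vector.
    """
    scores = [sum(w * q for w, q in zip(vec, query)) for vec in word_vectors]
    weights = softmax(scores)
    dim = len(query)
    pooled = [
        sum(wt * vec[i] for wt, vec in zip(weights, word_vectors))
        for i in range(dim)
    ]
    return weights, pooled

# Three toy 2-dimensional "word embeddings"; the third aligns best with the query.
weights, pooled = attention_pool([[1.0, 0.0], [0.0, 1.0], [3.0, 0.0]], [1.0, 0.0])
```

The pooled vector is the kind of text representation that a classifier layer could then map to a code; that final step is left out here.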
A Hybrid Optimization Framework Fusing Word- and Sentence-Level Information for Extractive Summarization
LIN Xinyi, YAN Rui, ZHAO Dongyan
Acta Scientiarum Naturalium Universitatis Pekinensis    2018, 54 (2): 229-235.   DOI: 10.13209/j.0479-8023.2017.148

To fuse word-level and sentence-level information from different semantic spaces, the authors propose a hybrid optimization framework that optimizes word-level information while simultaneously incorporating sentence-level information as constraints. The optimization is conducted by iterative unit substitutions. Performance on the DUC benchmark datasets demonstrates the effectiveness of the proposed framework in terms of ROUGE evaluation.

Discovering Abnormal Data in RDF Knowledge Base
HE Binbin, ZOU Lei, ZHAO Dongyan
Acta Scientiarum Naturalium Universitatis Pekinensis   
Automatic Understanding of Natural Language Questions for Querying Chinese Knowledge Bases
XU Kun, FENG Yansong, ZHAO Dongyan, CHEN Liwei, ZOU Lei
Acta Scientiarum Naturalium Universitatis Pekinensis   
A framework for transforming natural language questions into machine-understandable structured queries is presented. The authors propose using a query semantic graph to represent the semantics of Chinese questions, and adopt predicate and entity disambiguation to match the query graph to the schema of a knowledge base. The authors collect a benchmark of 42 frequently asked questions randomly sampled from 3 categories of Baidu Knows: person, location and organization. Experimental results show that the proposed framework can effectively convert natural language questions into SPARQL queries, and lays a good foundation for the next generation of intelligent question answering systems.
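As a rough illustration of the last step described in this abstract, the sketch below serializes a small query graph, given as subject-predicate-object triples, into a SPARQL query string. The function name, the ex: prefix, and the predicates are hypothetical and not taken from the paper; the disambiguation steps that would produce the triples are not shown.

```python
# Hypothetical namespace; a real knowledge base would define its own.
PREFIX = "PREFIX ex: <http://example.org/>"

def to_sparql(target, triples):
    """Serialize a list of (subject, predicate, object) patterns into SPARQL.

    target is the variable to return, e.g. "?x". The triples are assumed
    to have already been matched to the knowledge base schema.
    """
    patterns = " .\n  ".join(" ".join(triple) for triple in triples)
    return f"{PREFIX}\nSELECT DISTINCT {target} WHERE {{\n  {patterns} .\n}}"

# Toy query graph for a question such as "Which people were born in Beijing?"
query = to_sparql("?x", [
    ("?x", "ex:type", "ex:Person"),
    ("?x", "ex:birthPlace", "ex:Beijing"),
])
print(query)
```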
C-TERN: A Temporal Information Processing Algorithm of Chinese Military News Story Based on Cascade Finite State Automata
WANG Wei, ZHAO Dongyan, SU Tingting
Acta Scientiarum Naturalium Universitatis Pekinensis   
The authors propose a new method, C-TERN, to recognize and normalize temporal expressions in military news stories based on cascade finite state automata. First, C-TERN separates the temporal information of general language from that of military language into layers and recognizes temporal expressions layer by layer. Then, in the normalization procedure, C-TERN reasons about and normalizes simple/specific times, durations, and absolute and relative temporal expressions in four steps. The method pays special attention to the correctness of rule extraction, the resolution of conflicts between rules, and the soundness of the matching method. Experimental results on multi-information show that the proposed method can effectively recognize and normalize absolute and relative temporal expressions as well as simple/specific times and durations. It better meets the temporal information processing needs of military applications.
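The layered recognize-then-normalize idea in this abstract can be sketched in a simplified form. This is not C-TERN: regular expressions stand in for the finite state automata, the two layers (absolute dates, then a few relative day words) and all names are assumptions for illustration, and the military-language layer is omitted.

```python
import re
from datetime import date, timedelta

# Layer 1: absolute dates such as 2012年3月5日.
ABSOLUTE = re.compile(r"(\d{4})年(\d{1,2})月(\d{1,2})日")
# Layer 2: a small, illustrative set of relative day expressions.
RELATIVE = {"前天": -2, "昨天": -1, "今天": 0, "明天": 1}
RELATIVE_RE = re.compile("|".join(RELATIVE))

def normalize_times(text, reference):
    """Return (expression, ISO date) pairs in order of appearance.

    reference is the date against which relative expressions are resolved,
    for example the publication date of the story.
    """
    found = []
    for match in ABSOLUTE.finditer(text):
        year, month, day = map(int, match.groups())
        found.append((match.start(), match.group(), date(year, month, day).isoformat()))
    for match in RELATIVE_RE.finditer(text):
        resolved = reference + timedelta(days=RELATIVE[match.group()])
        found.append((match.start(), match.group(), resolved.isoformat()))
    return [(expr, iso) for _, expr, iso in sorted(found)]

pairs = normalize_times("2012年3月5日演习开始,明天结束", date(2012, 3, 5))
```

Running the layers in a fixed order is one simple way to keep rules from colliding; the paper's own conflict handling is not described in enough detail here to reproduce.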
Ontology-Based News Personalized Recommendation
RAO Junyang, JIA Aixia, FENG Yansong, ZHAO Dongyan
Acta Scientiarum Naturalium Universitatis Pekinensis   
The authors concentrate on exploiting background knowledge to address semantic analysis in content-based filtering. An Ontology-Based Similarity Model (OBSM) is proposed to calculate news-user similarity through collaboratively built ontological structures. To deal with the noisy nature of these coarse-grained structures, an ontology-based clustering model, called X-OBSM, is introduced into the framework; it clusters the concepts of a user profile on a coarse-grained ontology. Experimental results show that both OBSM and X-OBSM outperform the baselines by a large margin; in particular, X-OBSM performs better than OBSM in both quality and efficiency.
Multiclass Kernel Polarization and Its Application to Parameter Selection of RBF Kernel with Multiple Widths
WANG Tinghua, ZHAO Dongyan, ZHANG Qiong
Acta Scientiarum Naturalium Universitatis Pekinensis   
Identification of Topic Sentence about Key Event in Chinese News
WANG Wei, ZHAO Dongyan, ZHAO Wei
Acta Scientiarum Naturalium Universitatis Pekinensis   
The authors propose an approach to extract topic sentences that describe the key event from a news article. Considering the special structure of news articles, the relations between news articles and the key events reported in them are studied, as well as the characteristics of a news headline in three aspects: information, form and language. A novel method based on the information aspect of a headline is used to extract a topic sentence that contains the key event information from a news story. The method first classifies a news headline as informative or non-informative, and then considers text and semantic features of a sentence, such as word frequency, sentence length, location in the text and word co-occurrence with the headline, to evaluate the importance of each sentence and select the most important one as the topic sentence. Experimental results show that this method can identify a topic sentence accurately and that the proposed approach lays good groundwork for event information extraction.
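The sentence-scoring step described in this abstract can be sketched as a weighted sum of features. This is an illustration only: the feature weights, the 20-token length cap, whitespace tokenization, and the function names are assumptions, and the headline classification step (informative or non-informative) is skipped by treating the headline as informative.

```python
def score_sentence(tokens, headline_tokens, position, total):
    """Combine three of the features named in the abstract into one score."""
    headline_set = set(headline_tokens)
    # Share of headline words that also occur in the sentence.
    overlap = len(set(tokens) & headline_set) / max(len(headline_set), 1)
    # Earlier sentences score higher.
    position_score = 1.0 - position / max(total, 1)
    # Longer sentences score higher, capped at 20 tokens (arbitrary cap).
    length_score = min(len(tokens) / 20.0, 1.0)
    # Illustrative weights, not values from the paper.
    return 0.6 * overlap + 0.3 * position_score + 0.1 * length_score

def pick_topic_sentence(sentences, headline):
    """Return the highest-scoring sentence for the given headline."""
    headline_tokens = headline.lower().split()
    tokenized = [s.lower().split() for s in sentences]
    scores = [
        score_sentence(tokens, headline_tokens, i, len(sentences))
        for i, tokens in enumerate(tokenized)
    ]
    best = max(range(len(sentences)), key=scores.__getitem__)
    return sentences[best]

story = [
    "Officials spoke on Monday .",
    "The navy begins a large exercise in the Pacific this week .",
    "Weather was mild .",
]
topic = pick_topic_sentence(story, "Navy begins exercise in Pacific")
```

For Chinese text the whitespace split would need to be replaced by a word segmenter, which is outside the scope of this sketch.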